Could graph neural networks learn better molecular representation for drug discovery? A comparison study of descriptor-based and graph-based models
Abstract Graph neural networks (GNNs) have been considered an attractive modelling method for molecular property prediction, and numerous studies have shown that GNNs could yield more promising results than traditional descriptor-based methods. In this study, based on 11 public datasets covering various property endpoints, the predictive capacity and computational efficiency of the prediction models developed by eight machine learning (ML) algorithms, including four descriptor-based models (SVM, XGBoost, RF and DNN) and four graph-based models (GCN, GAT, MPNN and Attentive FP), were extensively tested and compared. The results demonstrate that on average the descriptor-based models outperform the graph-based models in terms of prediction accuracy and computational efficiency. SVM generally achieves the best predictions for the regression tasks. Both RF and XGBoost can achieve reliable predictions for the classification tasks, and some of the graph-based models, such as Attentive FP and GCN, can yield outstanding performance for a fraction of the larger or multi-task datasets. In terms of computational cost, XGBoost and RF are the two most efficient algorithms and need only a few seconds to train a model even for a large dataset. The model interpretations by the SHAP method can effectively explore the established domain knowledge for the descriptor-based models. Finally, we explored the use of these models for virtual screening (VS) towards HIV and demonstrated that different ML algorithms offer diverse VS profiles. All in all, we believe that the off-the-shelf descriptor-based models can still be directly employed to accurately predict various chemical endpoints with excellent computability and interpretability.
Similar resources
A Novel Molecular Descriptor Derived from Weighted Line Graph
The Bertz indices, derived by counting the number of connecting edges of line graphs of a molecule were used in deriving the QSPR models for the physicochemical properties of alkanes. The inability of these indices to identify the hetero centre in a chemical compound restricted their applications to hydrocarbons only. In the present work, a novel molecular descriptor has been derived from the w...
Construction and validation of a computerized adaptive translation test (a receptive based study)
Computerized adaptive testing (CAT) is a novel method for assessing students' academic level. In fact, computerized tests are moving rapidly toward becoming a practical replacement for paper-based tests (Kingsbury and Houser, 1993). The present paper seeks a computerized adaptive test for translation. To this end, two questionnaires comprising 55 translation items were distributed among 102 examinees and 10 English language instructors. The first questionnaire, among 102 students ...
Learning Graph-Level Representation for Drug Discovery
Predicting macroscopic influences of drugs on the human body, like efficacy and toxicity, is a central problem of small-molecule based drug discovery. Molecules can be represented as undirected graphs, and we can utilize graph convolutional networks to predict molecular properties. However, graph convolutional networks and other graph neural networks all focus on learning node-level representat...
Building a Generic Graph-based Descriptor Set for use in Drug Discovery
The ability to predict drug activity from molecular structure is an important field of research both in academia and in the pharmaceutical industry. Raw 3D structure data is not in a form suitable for identifying properties using machine learning so it must be reconfigured into descriptor sets that continue to encapsulate important structural properties of the molecule. In this study, a large n...
Graph Convolutional Neural Networks for ADME Prediction in Drug Discovery
ADME in-silico methods have grown increasingly powerful over the past twenty years, driven by advances in machine learning and the abundance of high-quality training data generated by laboratory automation. Meanwhile, in the technology industry, deep-learning has taken off, driven by advances in topology design, computation, and data. The key premise of these methods is that the model is able to...
Journal
Journal title: Journal of Cheminformatics
Year: 2021
ISSN: 1758-2946
DOI: https://doi.org/10.1186/s13321-020-00479-8